15 research outputs found

Predictive Complexity Priors

Authors: Gordon J, Hernández-Lobato JM, Nalisnick E
Publication venue: Proceedings of Machine Learning Research
Publication date: 01/01/2021

Specifying a Bayesian prior is notoriously difficult for complex models such as neural networks. Reasoning about parameters is made challenging by the high dimensionality and over-parameterization of the space. Priors that seem benign and uninformative can have unintuitive and detrimental effects on a model's predictions. For this reason, we propose predictive complexity priors: a functional prior that is defined by comparing the model's predictions to those of a reference model. Although originally defined on the model outputs, we transfer the prior to the model parameters via a change of variables. The traditional Bayesian workflow can then proceed as usual. We apply our predictive complexity prior to high-dimensional regression, reasoning over neural network depth, and sharing of statistical strength for few-shot learning.

Apollo (Cambridge)

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Bayesian batch active learning as sparse subset approximation

Authors: Gordon J., Hernández-Lobato J.M., Nalisnick E., Pinsler R.
Publication venue: Neural Information Processing Systems Foundation
Publication date: 01/01/2020

Leveraging the wealth of unlabeled data produced in recent years provides great potential for improving supervised models. When the cost of acquiring labels is high, probabilistic active learning methods can be used to greedily select the most informative data points to be labeled. However, for many large-scale problems standard greedy procedures become computationally infeasible and suffer from negligible model change. In this paper, we introduce a novel Bayesian batch active learning approach that mitigates these issues. Our approach is motivated by approximating the complete data posterior of the model parameters. While naive batch construction methods result in correlated queries, our algorithm produces diverse batches that enable efficient active learning at scale. We derive interpretable closed-form solutions akin to existing active learning procedures for linear models, and generalize to arbitrary models using random projections. We demonstrate the benefits of our approach on several large-scale regression and classification tasks.

Comment: NeurIPS 201

arXiv.org e-Print Archive

International Migration, Integration and Social Cohesion online publications

UvA-DARE

On the impact of non-IID data on the performance and fairness of differentially private federated learning

Authors: Amiri S., Belloum A., Gommans L., Klous S., Nalisnick E.
Publication venue: IEEE Computer Society
Publication date: 01/01/2022

International Migration, Integration and Social Cohesion online publications

UvA-DARE

Enhancing VAEs for Collaborative Filtering: Flexible Priors & Gating Mechanisms

Authors: Alemi A., Hoffman M. D., Hsu W. N., Nalisnick E., Sønderby C. K., Tomczak J., Van den Oord A.
Publication venue: Association for Computing Machinery (ACM)
Publication date: 03/11/2019

Neural network-based models for collaborative filtering (CF) have recently started to gain attention. One branch of research uses deep generative models to model user preferences, where variational autoencoders (VAEs) were shown to produce state-of-the-art results. However, the current variational autoencoder for CF has some potentially problematic characteristics. The first is the overly simplistic prior that VAEs incorporate for learning the latent representations of user preference. The other is the model's inability to learn deeper representations with more than one hidden layer for each network. Our goal is to incorporate appropriate techniques to mitigate these problems and further improve recommendation performance. Our work is the first to apply flexible priors to collaborative filtering and to show that simple priors (as in the original VAEs) may be too restrictive to fully model user preferences, and that setting a more flexible prior gives significant gains. We experiment with the VampPrior, originally proposed for image generation, to examine the effect of flexible priors in CF. We also show that VampPriors coupled with gating mechanisms outperform SOTA results, including the Variational Autoencoder for Collaborative Filtering, by meaningful margins on two popular benchmark datasets (MovieLens & Netflix).

arXiv.org e-Print Archive

Crossref

Calibrated Learning to Defer with One-vs-All Classifiers

Authors: Nalisnick E., Verma R.
Publication date: 01/01/2022

The learning to defer (L2D) framework has the potential to make AI systems safer. For a given input, the system can defer the decision to a human if the human is more likely than the model to take the correct action. We study the calibration of L2D systems, investigating if the probabilities they output are sound. We find that Mozannar & Sontag's (2020) multiclass framework is not calibrated with respect to expert correctness. Moreover, it is not even guaranteed to produce valid probabilities due to its parameterization being degenerate for this purpose. We propose an L2D system based on one-vs-all classifiers that is able to produce calibrated probabilities of expert correctness. Furthermore, our loss function is also a consistent surrogate for multiclass L2D, like Mozannar & Sontag's (2020). Our experiments verify that not only is our system calibrated, but this benefit comes at no cost to accuracy. Our model's accuracy is always comparable (and often superior) to Mozannar & Sontag's (2020) model's in tasks ranging from hate speech detection to galaxy classification to diagnosis of skin lesions.

Comment: Accepted at the International Conference on Machine Learning (ICML), 202

arXiv.org e-Print Archive

UvA-DARE

International Migration, Integration and Social Cohesion online publications

Dropout as a structured shrinkage prior

Authors: Hernández-Lobato JM, Nalisnick E, Smyth P
Publication date: 01/01/2019

Dropout regularization of deep neural networks has been a mysterious yet effective tool to prevent overfitting. Explanations for its success range from the prevention of "co-adapted" weights to it being a form of cheap Bayesian inference. We propose a novel framework for understanding multiplicative noise in neural networks, considering continuous distributions as well as Bernoulli noise (i.e. dropout). We show that multiplicative noise induces structured shrinkage priors on a network's weights. We derive the equivalence through reparametrization properties of scale mixtures and without invoking any approximations. Given the equivalence, we then show that dropout's Monte Carlo training objective approximates marginal MAP estimation. We leverage these insights to propose a novel shrinkage framework for resnets, terming the prior automatic depth determination as it is the natural analog of automatic relevance determination for network depth. Lastly, we investigate two inference strategies that improve upon the aforementioned MAP approximation in regression benchmarks.

CUED - Cambridge University Engineering Department